Abstract 

Big qualitative data (Big Qual), or research involving large qualitative data sets, has introduced many newly evolving conventions 
that have begun to change the fundamental nature of some qualitative research. In this methodological essay, we first distinguish 
big data from big qual. We define big qual as data sets containing either primary or secondary qualitative data from at least 100 
participants analyzed by teams of researchers, often funded by a government agency or private foundation, conducted either as a 
stand-alone project or in conjunction with a large quantitative study. We then present a broad debate about the extent to which 
big qual may be transforming some forms of qualitative inquiry. We present three questions, which examine the extent to which 
large qualitative data sets offer both constraints and opportunities for innovation related to funded research, sampling strategies, 
team-based analysis, and computer-assisted qualitative data analysis software (CAQDAS). The debate is framed by four related 
trends to which we attribute the rise of big qual: the rise of big quantitative data, the growing legitimacy of qualitative and mixed 
methods work in the research community, technological advances in CAQDAS, and the willingness of government and private 
foundations to fund large qualitative projects. 
The term “big qualitative data (big qual),” or research involving
large qualitative data sets, was likely borrowed from the
term “big data.” Big data typically refers to the large quantitative
data sets increasingly used by academic researchers, 
government and nonprofit agencies, the private sector, and 
nonacademic political researchers. Big quantitative data has 
been the subject of a vigorous public debate related to individ- 
ual privacy rights and the appropriate analysis and interpreta- 
tion of such data (e.g., Brooks, 2013; Ohm, 2012; Shah, Horne, 
& Capella, 2012). The original conception of big data tended to 
assume that data constituted numbers not words and images 
(Diebold, 2003, 2012). After big quantitative data emerged, 
however, many newly introduced conventions for big qual 
were also developed that began to change the fundamental 
nature of some qualitative research. The purpose of this meth- 
odological essay is to distinguish big qual from big data and 
then to present a broad debate about the extent to which big 
qual may be transforming some forms of qualitative inquiry. 
Before the term big qual emerged, one of the earliest exam- 
ples of large-scale mixed methods research projects was the 
Framingham Heart Study, a medical study begun in 1948 with 
5,209 participants that has continued to the present day with 
several new cohorts of participants including many of the children
and grandchildren of the original participants (Levy & 
Brink, 2005). Over 1,000 medical papers have been published 
from the Framingham data (Mahmood, Levy, Vasan, & Wang, 
2014). While the Framingham study was primarily quantitative, 
interviews were also conducted to measure psychosocial health 
factors such as anxiety, depression, social support, and hostility 
(K. Davidson, MacGregor, Stuhr, Dixon, & MacLean, 2000). 
Many large-scale mixed methods medical studies followed the 
Framingham Heart Study (Plano Clark, 2010) including the 
present-day Precision Medicine Initiative (PMI), a mixed 
methods medical study conducted by the National Institutes 
of Health (NIH), whose goal is to enroll 1 million participants 
to study the role of genetics and lifestyle in health outcomes 
(Collins & Varmus, 2015). Though the PMI is a mixed methods 
study, it is as yet unclear to what degree the research 
design will incorporate both primary and secondary qualitative 
data as the project evolves. 

While a comprehensive big qual literature review is beyond 
the scope of this methodological essay, our initial review of 
large-scale qualitative and mixed methods studies conducted in 
the last 5 years uncovered research in a broad range of disci- 
plines including agriculture (e.g., Charatsari & Papadaki- 
Klavdianou, 2017), business (e.g., St-Hilaire, Gilbert, & 
Lefebvre, 2018), environmental protection (e.g., Lynn, 2017), 
health and medicine (e.g., Hurst et al., 2016; Jenkins, Slemon, 
Haines-Saah, & Oliffe, 2018; Mayberry, 2016), public safety 
(e.g., Kerrison, Cobbina, & Bender, 2018), sociology and 
anthropology (e.g., Knight, Cottrell, Pickering, Bohren, & 
Bright, 2017; Manning & Greenwood, 2018; Reed, Strzyzy- 
kowski, Chiaramonte, & Miller, 2018), and education (e.g., 
Brower et al., 2017; Rutledge, Cohen-Vogel, & Osborne- 
Lampkin, 2012; Calma, 2013; Eta, Kallo, & Rinne, 2018; 
LaPointe-McEwan, DeLuca, & Klinger, 2017). Our initial 
review also showed that fewer than half of big qual studies 
involved primary data collection in the field. Many secondary 
data big qual studies in our review involved data downloaded 
from social media sources such as Facebook or Twitter (e.g., 
Greene, Choudhry, Kilabuk, & Shrank, 2011), qualitative data 
drawn from open-ended comment box questions on quantita- 
tive surveys (e.g., Elsesser & Lever, 2011), consumer research 
conducted by the private sector or political research conducted 
by nonacademic researchers (e.g., Clow & James, 2010), and 
content analysis conducted through computerized text-mining 
techniques (e.g., Guest & MacQueen, 2008). 

Big qual can also be aligned with “rapid qualitative inquiry” 
(Beebe, 2014) and “multi-sited ethnography” (Coleman & von 
Hellermann, 2011), though we include in our definition both 
“slow” and “rapid” methods and qualitative traditions beyond 
ethnography including, for instance, case study, narrative 
inquiry, and grounded theory. Our review revealed that the 
most common research tradition for big qual was the case study 
(e.g., Brower et al., 2017; Calma, 2013). 

Based on our review, we define big qual as data sets contain- 
ing either secondary qualitative data or primary data with at 
least 100 participants, analyzed by teams of researchers, often 
funded by a government agency or private foundation, and 
conducted either as a stand-alone project or in conjunction with 
a large quantitative study. 

Saldaña (2013) observed that “a metacognition of method, 
even in an emergent, intuitive, inductive-oriented, and socially 
conscious enterprise such as qualitative inquiry, is vitally 
important” (p. 40). This methodological essay is intended to 
ask metacognitive questions about big qual by making issues 
explicit that have thus far remained largely implicit based on 
our review of the qualitative research methods literature. Not 
unlike case study research, we have found big qual designs to 
be very flexible in terms of how they can be combined with 
other qualitative traditions. Nonetheless, currently big qual is a 
collection of methods that lacks the rich philosophical history 
and broader application we find in qualitative traditions such as 
phenomenology or grounded theory. Therefore, we hope to 
initiate a debate about whether big qual might someday be 
grounded in a deeper philosophy. Before presenting our dis- 
cussion, we first must acknowledge how our work with large 
qualitative data sets contextualizes the debate presented here. 


Research Contexts 


The research context for this methodological essay was an 
ongoing 5-year mixed methods research project examining a major 
policy shift in the delivery of developmental education (DE 
or remediation) in the 28 state colleges in Florida (Brower, 
Bertrand Jones, Hu, & Park-Gaghan, in press; Brower et al., 
2017; Mokher, Spencer, Park, & Hu, 2019; Nix, Bertrand 
Jones, Brower, & Hu, in press; Park-Gaghan et al., in press; 
Park, Woods, Hu, Bertrand Jones, & Tandberg, 2018; Woods, 
Hu, Bertrand Jones, & Tandberg, 2018, 2019). The qualitative 
research methods for the DE project were informed by the work 
of researchers from two previous K-12 projects. One project 
was part of a multiyear research and reform effort focused on 
identifying the combination of essential components and pro- 
grams, practices, processes, and policies that make some high 
schools in large urban districts particularly effective with stu- 
dents from traditionally low-performing subgroups (Rutledge, 
Cohen-Vogel, & Osborne-Lampkin, 2012). The second 
explored the implementation of district programs used to train 
and certify school leaders in Florida’s 67 districts (Rutledge, 
Cohen-Vogel, Osborne-Lampkin, & Roberts, 2015). Our per- 
spective on big qual is grounded in our work on these projects 
and supported by our individual and collective interrogation of 
our data collection and analysis processes. We present a sum- 
mary of our process for the DE project, specifically, to provide 
context for our debate. 

As a mixed methods research project, the qualitative 
research team on the DE project collaborated with the quanti- 
tative research team through frequent meetings and informal 
discussions. Research findings from the quantitative team 
informed questions that were asked in successive iterations of 
the qualitative interview protocols and qualitative findings 
informed questions asked on annual surveys administered by 
the quantitative research team. 

Over the course of 5 years, the overarching qualitative 
research question shifted from policy implementation pro- 
cesses to promising institutional practices in community col- 
leges to organizational change and transformation. Our 
qualitative sampling strategy was a maximum variation sample 
at both the institution level and the individual level, which 
involved “purposely picking a wide range of cases to get var- 
iation on dimensions of interest” (Patton, 2015, p. 267). The 21 
institutions in our 5-year sample represented the majority of 
institutions in the Florida College System. Institutions were 
located in every region of the state and differed by enrollment 
size, location (i.e., rural, suburban, and urban), and average 
performance on student outcome measures from the quantita- 
tive data set. At the individual level, we sampled different types 
of campus personnel including presidents, administrators, 
faculty, advisors, and support staff as well as students reflecting 
the diversity of community college student populations (i.e., 
students of color, veterans, English-language learners, immi- 
grant students, undocumented students, parents and working 
adults, students with disabilities, first-generation college stu- 
dents, economically disadvantaged students, LGBTQ students, 
homeless students, and formerly incarcerated students). We 
also conducted interviews with state legislators and external 
policy stakeholders. 

Our data for the DE project consisted of field notes, verba- 
tim transcripts from focus groups and individual interviews, 
and institutional documents collected on site visits to 21 state 
colleges in Florida (with some repeat visits) over a 5-year 
period. To date, we have conducted 42 site visits and 166 focus 
groups comprising over 1,100 total participants. 

Given the volume of data, a team of five to six researchers
conducted the data analysis. The research team met weekly to
share findings and discuss the analysis process. Our data
analysis process employed 
pattern coding to identify central concepts and properties in the 
data (Corbin & Strauss, 2008; Miles, Huberman, & Saldaña, 
2014). To begin the process, one researcher read the verbatim 
transcripts to ascertain where the subject of the participants’ 
response had changed and a new paragraph should begin. This 
process was, in effect, a form of precoding. We then read 
through the field notes, institutional documents, and focus 
group data to synopsize the chronology of institutional 
processes. 

To establish inter-coder reliability, a number of specific 
analytic processes were necessary. Each of the codes in our 
coding frameworks needed consistent definitions in a codebook 
that researchers could refer to frequently. Our codebook de- 
emphasized codes reflecting highly theoretical or abstract con- 
cepts due to the likelihood that codes would be interpreted 
differently by different researchers. After we coded a subset 
of the data, we ran the Cohen’s κ coefficient function in NVivo, 
a computer-assisted qualitative data analysis software (CAQ- 
DAS) program. In the first round of reliability testing, we eval- 
uated two pairs of researchers, comparing the coding of an 
individual researcher with the coding of another individual 
researcher. In the second round of reliability testing, we fol- 
lowed the same process by comparing two sets of researchers. 
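
The pairwise comparison described above can be illustrated with a minimal sketch of Cohen’s κ for two coders. It assumes each coder assigns exactly one code to each transcript segment, which is a simplification; it is not a reproduction of the NVivo function, and the function name, code labels, and segment data are invented for illustration.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per segment."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of segments.")
    n = len(coder_a)
    # Observed agreement: share of segments given the same code by both coders.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: summed products of each coder's marginal code proportions.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data: ten segments coded independently by two researchers.
coder_1 = ["advising", "advising", "placement", "students", "students",
           "placement", "advising", "students", "placement", "advising"]
coder_2 = ["advising", "placement", "placement", "students", "students",
           "placement", "advising", "students", "advising", "advising"]
kappa = cohens_kappa(coder_1, coder_2)
print(round(kappa, 2))
```

In this toy example the coders agree on eight of ten segments, but κ is lower than .80 because part of that agreement would be expected by chance alone.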

We then initiated pattern coding. In this process, we devel- 
oped an a priori coding framework with codes at three levels 
(i.e., parent, child, and grandchild nodes) based on our initial 
reading of the data. During this process, we identified addi- 
tional emergent themes not captured under existing codes, 
resulting in additional codes. 

Our data collection and analysis efforts were iterative. Dur- 
ing the subsequent years of the developmental education proj- 
ect, we refined our field note observation forms and focus 
group protocols based on themes that emerged from the first 
round of data collection. We used our coding frameworks from 
the first year of data analysis as a starting point for the code- 
book in the second year. To do this, we subdivided some of the 
most frequently used codes into child and grandchild codes and 
collapsed some infrequently used codes into the parent codes. 
We also added codes for emergent themes that had not been 
coded in the first round of coding. For instance, an a priori 
parent code included “students.” This code was subdivided into 
“students, general” and “student populations.” However, infre- 
quently used grandchild codes “students, off-campus work” 
and “students, on-campus work” were collapsed into “student 
work.” 
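
The subdividing and collapsing of codes described above can be sketched as operations on a simple codebook structure. This is an illustrative sketch only: storing the codebook as a mapping from each code to its parent, the helper names, and the placement of the two work codes beneath “students, general” are all assumptions, not the project’s actual codebook or an NVivo feature.

```python
def subdivide(codebook, parent, new_children):
    """Return a copy of the codebook with new codes added beneath `parent`."""
    if parent not in codebook:
        raise KeyError(f"Unknown parent code: {parent}")
    revised = dict(codebook)
    for child in new_children:
        revised[child] = parent
    return revised


def collapse(codebook, old_codes, merged_code):
    """Return a copy in which `old_codes` are replaced by a single merged code.

    Assumption: the merged code is filed under the parent of the first old code.
    """
    revised = {code: parent for code, parent in codebook.items()
               if code not in old_codes}
    revised[merged_code] = codebook[old_codes[0]]
    return revised


def recode(segment_codes, old_codes, merged_code):
    """Relabel previously coded segments so they match the revised codebook."""
    return [merged_code if code in old_codes else code for code in segment_codes]


# Codebook stored as {code: parent code}; None marks a top-level (parent) code.
year_one = {"students": None}
year_one = subdivide(year_one, "students",
                     ["students, general", "student populations"])
# Hypothetical placement of the two work codes as grandchild codes.
work_codes = ["students, off-campus work", "students, on-campus work"]
year_one = subdivide(year_one, "students, general", work_codes)

# Year-two revision: collapse the infrequently used work codes into one code.
year_two = collapse(year_one, work_codes, "student work")
recoded = recode(["students, on-campus work", "student populations"],
                 work_codes, "student work")
```

Keeping each revision as a new copy of the codebook, rather than editing it in place, preserves a record of how the framework changed from year to year.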

This process was repeated each year to encompass emergent 
themes and reflect the overall project’s shifting research ques- 
tion. Our discussions of the process as outlined above provided 
the foundation for the questions we identify below. 


Debating Big Qual 


The debate presented here is framed by four related trends to 
which we attribute the rise of big qual: the rise of big quanti- 
tative data, the growing legitimacy of qualitative and mixed 
methods work in the research community (Creswell & Clark, 
2011), technological advances in CAQDAS (Bazeley & Jack- 
son, 2013; J. Davidson, Paulus, & Jackson, 2016), and the 
willingness of government and private foundations to fund 
large qualitative projects (Cheek, 2008; Plano Clark, 2010). 
For each question below, we present the opportunities for inno- 
vation as well as the constraints and challenges in research 
designs for large qualitative data sets that have emerged from 
our work. 


What Opportunities and Constraints Are Presented by 
Funded Research Involving Big Qual? 


Because big qual can be costly to conduct, many, though not all, 
research projects involving large qualitative data sets have been 
funded by a government agency or private foundation (Cheek, 
2008; Plano Clark, 2010). This can be a potential constraint in 
creating iterative research designs. Miles, Huberman, and Saldaña
(2014) have contrasted the linear and sequential nature of 
quantitative research methods with the iterative and cyclical 
nature of qualitative inquiry. Indeed, some qualitative tradi- 
tions such as grounded theory (Charmaz, 2014; Corbin & 
Strauss, 2008; Denzin & Lincoln, 2017) employ theoretical 
sampling methods, which study emergent constructs and social 
phenomena through alternating cycles of data collection and 
data analysis. 


Funding, induction, and iteration. Patton (2015) has commented on 
the uncertainty inherent in funded qualitative research due to 
balancing the funders’ need for an intentional, structured, and 
systematic research plan with the flexibility necessary to 
explore emergent themes that invariably arise during the 
research process: 


How will they [funders] know what will result from the inquiry if 
the design is only partially specified? The answer is: They won’t 
know with any certainty. All they can do is look at the results of 
similar qualitative inquiries, inspect the reasonableness of the 
overall strategies in the proposed design, and consider the capacity 
of the researcher to fruitfully undertake the proposed study. (p. 44) 


Not all big qual research projects are longitudinal or take place 
over multiple years. However, in the context of the DE 
research, our experience is that large qualitative projects can 
be equally or more inductive and iterative than small qualita- 
tive projects, but the iteration may unfold over several years of 
a project rather than within a shorter time frame. In addition, 
teams of researchers bring new concerns and insights that 
inform the unfolding analytic process. In practical and logisti- 
cal terms, it can be challenging to have multiple iterations of 
data collection and analysis within a year. It is costly to return 
to the field to collect more data multiple times, and it is difficult 
to coordinate the schedules of teams of researchers and field 
sites. Between years of a big qual project, research designs can 
change significantly as new research questions emerge from 
the data or minor adjustments are made to the existing research 
design. With the shifting focus of the DE project, the data 
collection and data analysis plans were improved yearly, and 
the annual schedule of data collection, data analysis, and writ- 
ing was adjusted and refined to better reflect the time necessary 
to complete each phase of the project and the resulting subsidiary
research questions. 


The qualitative story line. Another constraint in funded big qual 
projects is related to capacity issues. Because funding enables 
researchers to collect more data, funded projects can generate 
such a large quantity of data that data reduction techniques 
become essential. With this volume of data, it can be challen- 
ging to separate the “noise” from the main “story line” in the 
data, making it difficult to answer the question: What are the 
data actually saying? In some instances, this required us to 
employ counting techniques in the DE project and then to 
provide the rationales and methods of counting identified by 
other qualitative researchers (Hannah & Lautsch, 2011). 

In addition, big qual can present challenges for reporting 
theories and findings in a cohesive narrative when drawn from 
so much data. Collectively, we have learned to report the theory 
and research findings generated from big qual in a variety of 
ways, including single case studies with sections linking indi- 
viduals’ lived experiences within broad institution-level or 
system-level patterns (e.g., Brower, Bertrand Jones, & Hu, 
2018; Nix et al., in press; Rutledge et al., 2015), multiple 
case studies with vignettes illustrating findings within cases 
coupled with figures and/or tables summarizing patterns across 
all cases (e.g., Brower et al., in press; Brower, Mokher, Ber- 
trand Jones, Cox, & Hu, 2019; Cohen-Vogel, Rutledge, & 
Osborne-Lampkin, 2011; Rutledge et al., 2015), and individual 
examples nested within composite institutions (e.g., Arnault, 
2002; Brower et al., 2017; Conant, 2014). Composite institu- 
tions (or individuals) involve presenting data in vignettes that 
contain data from more than one institution or individual. Com- 
posite institutions or individuals can be used as a means of 
consolidating and summarizing large quantities of data as well 
as highlighting similar patterns across several examples from 
the data such as in a multiple case study when several cases 
share important characteristics. 


Mixed methods and Big Qual. A factor that can offer either oppor- 
tunities or constraints in big qual is whether the project is 
mixed methods. In some instances, the quantitative and quali- 
tative research designs are truly integrated and, in other 
instances, the qualitative research design is merely an “add-on” 
to a large quantitative project. Truly integrated designs may 
increase between-methods or mixed methods triangulation 
(Burke Johnson, Onwuegbuzie, & Turner, 2007; Denzin, 
1978), while quantitative and qualitative research designs can 
run parallel with little integration when qualitative research is 
seen merely as an add-on to a quantitative project. 

In addition, while it might be ideal for big qual research 
teams to be composed entirely of researchers who subscribe 
to a pragmatist research paradigm, it may be unrealistic in real- 
world research settings for all members of a team to subscribe 
to this perspective. Instead, it may be more likely that, taken as 
a whole, the mixed methods team will subscribe to “dialectical 
pluralism,” which Creamer (2018) defines as “a paradigm that 
reflects what some consider to be the overarching logic of 
mixed methods: the deliberate engagement with different 
points of view and ways of achieving knowledge” (p. 245). 
Nonetheless, the dialectical pluralism perspective with its mix- 
ture of positivist, realist, social constructionist, and interpreti- 
vist research epistemologies can lead to either significant 
misunderstandings on mixed methods projects or research 
designs that are more collaborative, triangulated, and ulti- 
mately more rigorous. 


What Opportunities and Constraints Are Presented by 
Sampling Strategies Used in Big Qual Research Designs? 


We contrast the sampling strategies in quantitative and quali- 
tative research as the difference between probability sampling 
that seeks to establish generalizability by generalizing from a 
sample to a population and purposeful sampling that seeks to 
establish transferability by selecting information-rich cases for 
an in-depth understanding of a phenomenon (Lincoln & Guba, 
1985; Patton, 2015; Peshkin, 2001). From both a methodologi- 
cal and practical standpoint, important reasons remain for con- 
ducting deep single-site analyses within narrowly bounded, 
small n qualitative studies (Patton, 2015). 


Extending opportunities for generalization. Nonetheless, we pro- 
pose that large qualitative data sets may be gradually moving 
qualitative data away from purposeful sampling for transfer- 
ability in the direction of sampling for generalization (Maxwell 
& Chmiel, 2014; Polit & Beck, 2010). Big qual can employ a 
variety of qualitative sampling strategies, including single sig- 
nificant case sampling; comparison-focused sampling; group 
characteristics sampling; theory-focused and concept sam- 
pling; instrumental-use multiple case sampling; sequential and 
emergence-driven sampling; analytically focused sampling; 
and mixed, stratified, or nested sampling (Patton, 2015). 
Though not all sampling strategies in big qual are alike, the 
authors’ research projects often employed maximum variation 
sampling. Though distinct from quantitative random sampling, 
maximum variation sampling does share a concern with the 
variability and representativeness of the sample. 

We argue that sampling for big qual affords significant 
opportunities for innovation. Many qualitative researchers have 
pointed to the capacity of qualitative research for generating 
theory (e.g., Charmaz, 2014; Corbin & Strauss, 2008; Denzin 
& Lincoln, 2017; Patton, 2015). We agree. Our experience 
conducting big qual research suggests that large qualitative 
data sets and their sampling strategies can contribute to 
theory-building in unique ways. Qualitative research has tradi- 
tionally had the luxury of looking at small bounded systems 
(e.g., individuals, small groups, organizations). We often deli- 
mit qualitative research questions by stating that phenomena 
are beyond the scope of our study. However, large n studies 
may decrease the artificial boundaries we create for practical 
reasons around bounded systems. 


Adding depth and breadth to the analysis. Sampling in large n 
studies may lend itself to bounding cases more broadly by place 
and/or time (e.g., enactment of a federal education policy over 
a range of years in different states or student learning processes 
from preschool through graduate school). Because big qual 
allows us to conduct longitudinal qualitative research (E. 
Davidson & Weller, 2016), we can examine stages and cycles 
of social phenomena because “patterned periodicities provide 
us with a short- and long-term understanding of how the 
rhythms of life and work may proceed” (Miles, Huberman, & 
Saldaña, 2014, p. 211) rather than focusing more narrowly on 
phenomena from a cross-sectional perspective. 

In the DE project, for instance, the central finding of a con- 
ference paper based on a single year of cross-sectional big qual 
data shifted when it was further developed into a journal article 
based on multiple years of data. Specifically, the cross-sectional 
conference paper found that, from the perspective of adminis- 
trators and staff, the open-access mission of community colleges 
in Florida was compromised by sweeping state-level legislation 
focused on efficiency (Nix et al., 2016). However, the eventual 
journal article based on longitudinal data found that some cam- 
pus personnel became more accepting of reform efforts over the 
course of four years when they saw that the impact of Senate Bill 
1720 on equality was not as negative as they initially feared (Nix 
et al., in press). 

In addition, large n studies with many research participants 
at multiple research sites can adopt sampling strategies that 
help to explore a phenomenon nested within multiple units of 
analysis (e.g., how policy implementation unfolds at the state, 
institution, group, and individual levels or educational equity 
at the individual, classroom, school, district, and state levels). 
To some degree, qualitative research has always allowed us to 
study social phenomena at multiple units of analysis (Crea- 
mer, 2018; Yin, 2013). However, like binoculars that can 
focus near or far, large qualitative data sets enhance this function
by allowing us to dial down to closely examine 
phenomena at the micro or individual level and then dial out 
to view phenomena at the macro or societal level. In this way, 
researchers can begin to discover large-scale patterns linking 
each level of analysis to a coherent whole. Thus, with the 
significant variation that is now possible in large qualitative 
data sets, coupled with the human pattern-identification capa- 
bility, it may be possible to examine social phenomena out- 
side narrowly bounded systems to generate more elaborate, 
big picture theories from data. 

For instance, a manuscript about students with stigmatized 
and minoritized identities linked student experiences to efforts 
of staff to assist these students in managing their stigma to 
persist and succeed in community college. This article also 
linked student and staff interactions at the individual unit of 
analysis to a broader institutional ethic of care as well as the 
open-access community college mission at the institutional unit 
of analysis (Brower et al., 2018). 

In addition, big qual studies that are part of mixed methods projects 
can improve theory by initially theorizing about large systems 
or hypothesizing about emerging constructs and associations 
among constructs, which quantitative research can then verify, 
extend, or disprove. 

An example of theory-building from the DE reform project 
was a qualitatively derived typology of four broad policy 
implementation patterns (oppositional, circumventing, satisfi- 
cing, and facilitative implementation) developed in conjunc- 
tion with 14 specific behaviors such as improvising and leaving 
the institution (Brower et al., 2017). An empirically grounded 
typology, or classification of a social phenomenon according to 
type, can be derived from qualitative data using methods such 
as those developed by Kluge (2000). Qualitative typologies can 
be either “indigenous typologies,” derived from research parti- 
cipants’ in vivo classifications of their own cultural settings, 
“analyst-constructed typologies,” derived from researchers’ 
identification of patterns in qualitative data or a mixture of 
both (Patton, 2015). While the case was specific to an individ- 
ual higher education policy in a particular state context, the 
breadth and depth of the data likely increased the researchers’ 
ability to identify the four broad patterns as well as the potential 
transferability of the theory to other education policies and 
policy domains. Thus, the big qual sampling strategy helped 
to ensure that the typology was both exhaustive and compre- 
hensive, increasing the likelihood that all implementation pat- 
terns and behaviors had been identified. In addition, the breadth 
of the data allowed researchers to distinguish between wide- 
spread and infrequent implementation behaviors. A small n 
qualitative study with data from only three institutions would 
likely have identified a frequently coded behavior like funda- 
mental rule change or improvisation but might have failed 
entirely to identify leaving the organization as an implementation
behavior because it appeared in the data only 3 times 
across all institutions. 
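
The distinction between widespread and infrequent behaviors can be illustrated with a simple counting sketch. All data below are invented for illustration (the college names, code counts, and the `code_spread` helper are hypothetical and do not reproduce the project’s data); the sketch only shows how a rare code can disappear entirely from a smaller sample of sites.

```python
from collections import Counter, defaultdict


def code_spread(coded_segments):
    """Return {code: (number of coded segments, number of institutions)}."""
    counts = Counter(code for _, code in coded_segments)
    sites = defaultdict(set)
    for institution, code in coded_segments:
        sites[code].add(institution)
    return {code: (counts[code], len(sites[code])) for code in counts}


# Hypothetical (institution, code) pairs from a coded data set.
segments = [
    ("College A", "improvisation"), ("College A", "improvisation"),
    ("College B", "improvisation"), ("College C", "improvisation"),
    ("College D", "improvisation"), ("College E", "improvisation"),
    ("College B", "leaving the institution"),
    ("College D", "leaving the institution"),
    ("College D", "leaving the institution"),
]
full_sample = code_spread(segments)

# A hypothetical small n study that visited only three of the five institutions.
visited = {"College A", "College C", "College E"}
small_sample = code_spread([s for s in segments if s[0] in visited])
```

In this toy data the frequent code survives in the three-institution subsample, while the rare code is absent from it altogether.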


Contextual details in Big Qual. Despite the strengths of big qual 
research designs, our experience with large n qualitative studies
is that with so many participants in the sample, the 
researchers who collect the data can be more likely to remember
contextual details from the most vivid interview participants
and research environments. Therefore, even with 
thorough field notes, some of the contextual richness of field 
work can be lost in large qualitative studies. As qualitative 
researchers, we have not abandoned the language of transfer- 
ability. However, we suggest it may be time to begin posing the 
following question: If big qual is less focused on examining 
unique and information-rich cases, is the sampling logic slowly 
moving away from transferability in the direction of 
generalization? 


What Opportunities and Constraints Are Presented by 
Team-Based Data Analysis Processes Facilitated by 
Technology in Big Qual? 


Inter-coder reliability is a way to ensure coding consistency 
and agreement among a team of researchers (Bazeley & Jack- 
son, 2013; J. Davidson et al., 2016). We argue that one draw- 
back of establishing inter-coder reliability in a large qualitative 
project is that coding for abstract concepts or sensitizing con- 
cepts (Patton, 2015), which can be essential in generating the- 
ory, may be de-emphasized. For instance, it may never be 
possible for a team to reach agreement on the definition of 
abstract yet essential terms such as “metacognition” or “policy 
entrepreneur.” We recognize that while new technological fea- 
tures of CAQDAS greatly facilitate coding (Bazeley & Jack- 
son, 2013; J. Davidson et al., 2016), they can also constrain 
coding by making it normative for teams of researchers to use 
features such as the Cohen’s κ coefficient function to establish 
reliability among team members. Burla et al. (2008) and Everitt 
(1996) have reported that κ values of .41–.60 represent moderate
inter-coder reliability, values greater than .60 indicate 
satisfactory reliability, and values greater than .80 represent 
nearly perfect reliability. However, κ coefficients are not the 
only quality criterion, and Hai-Jew (2017) has pointed out that 
κ scores tend to decrease with a large number of codes and a 
large number of researchers. Therefore, we suggest that while 
Cohen’s κ coefficients above .60 may be ideal for big 
qual projects, the real value in calculating κ coefficients lies in 
the discussions that necessarily take place among researchers 
related to their differing understandings of codes, the defini- 
tions of codes, and how these definitions apply to the data. 
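
To make the κ thresholds above concrete, the following is a minimal sketch of how Cohen’s κ can be computed for two coders who each assign exactly one code to every data segment. It is an illustration only and does not reproduce the coefficient function of any particular CAQDAS package. The function names, the example codes ("barrier" and "support"), and the ten example segments are hypothetical, and the interpretation bands simply restate the ranges reported in the text.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per segment."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of segments")
    n = len(coder_a)
    # Observed agreement: share of segments given the same code by both coders.
    p_o = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: computed from each coder's marginal code frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_e = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if p_e == 1.0:
        # Both coders used one identical code throughout; kappa is undefined,
        # so this sketch (by assumption) treats the case as full agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def interpret(kappa):
    """Bands as reported in the text (Burla et al., 2008; Everitt, 1996)."""
    if kappa > 0.80:
        return "nearly perfect"
    if kappa > 0.60:
        return "satisfactory"
    if kappa >= 0.41:
        return "moderate"
    return "below the moderate band"


# Hypothetical example: ten segments, two codes, two disagreements.
coder_a = ["barrier"] * 6 + ["support"] * 4
coder_b = ["barrier"] * 5 + ["support"] + ["barrier"] + ["support"] * 3
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2), interpret(kappa))
```

In this hypothetical example the coders agree on 8 of 10 segments, yet κ falls only in the moderate band because agreement expected by chance is already high with just two codes. The two segments on which the coders disagree are the ones a team would need to discuss, which is where we locate the real value of the calculation.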

Nevertheless, precise definitions for codes, which are nec- 
essary in establishing reliability and consistency in coding 
among teams of researchers, may contribute to greater clarity 
with respect to the concepts present in the data. Thus, while the 
de-emphasis on abstraction can detract from theory-building 
processes, the greater precision, clarity, and creativity of group 
processes may be beneficial to identifying patterns in qualita- 
tive data and ultimately to theory-building. 


Induction, teams, and technology. Some aspects of team-based 
data analysis in big qual can make inductive research designs 
challenging. Inductive methods begin, 
with specific observations and builds toward general pat- 
terns. ... The strategy of inductive designs is to allow the important 
analysis dimensions to emerge from patterns found in the cases 
under study without presupposing in advance what the important 
dimensions will be. (Patton, 1990, p. 56). 


It can be difficult to employ an inductive process when the 
coding framework in large qualitative data sets typically con- 
sists of more a priori codes than emergent codes. The emergent 
nature of identifying patterns in the qualitative data analysis 
process is described as 


more than just a paraphrasing. ...It is more than just noting con- 
cepts in the margins of the field notes or making a list of codes as in 
a computer program. Identifying patterns involves interacting with 
the data using techniques such as asking questions about the data, 
making comparisons between data, and then developing those con- 
cepts in terms of their properties and dimensions. (Corbin & 
Strauss, 2008, p. 66) 


Charmaz (2014) has similarly observed that coding “generates 
the bones of your analysis. . . . [I]ntegration will assemble those 
bones into a working skeleton” (p. 45). Likewise, Bernard 
(2011) describes analysis as “the search for patterns in data 
and for ideas that help explain why those patterns are there 
in the first place” (p. 388). 

We argue that certain systematic aspects of data analysis 
with a research team may move the pattern identification pro- 
cess of qualitative research away from emergent data analysis. 
Technical difficulties can arise in the software file when a 
coding framework evolves significantly over time with a team 
of researchers. This relative lack of flexibility in the coding 
framework can make it more daunting to identify high-level 
patterns in the data. Due to these difficulties, an individual 
researcher working alone may have a greater ability to employ 
emergent coding by changing the coding structure as the proj- 
ect progresses (Saldaña, 2013). 


The underlying “reality” in the data. Perhaps more importantly, 
the necessity of assigning codes to text with precise definitions 
tends to assume there is one underlying reality in the data 
instead of many possible realities. As Stake (1995) observed, 
“most qualitative researchers not only believe that there are 
multiple perspectives or views that need to be represented, but 
that there is no way to establish, beyond contention, the best 
view” (p. 108). 

Another challenge that arises with team-based data analysis 
is related to interpreting data from the etic (cultural outsider 
perspective) versus the emic (indigenous or cultural insider 
perspective) (Denzin & Lincoln, 2017; Pike, 1967) and the 
co-construction of meaning between participant and 
researcher. In some instances, a subset of the research team 
in big qual projects will go into the field to collect data, but a 
larger team of researchers will analyze that data. However 
comprehensive the interview transcripts and/or field notes 
might be, researchers who did not collect data in the field lack 
the benefit of entering into the participants’ cultural setting and 
do not have the benefit of recalling contextual details (e.g., 
body language, facial expressions, details in the research envi- 
ronment). Because the number of researchers who did not go 
into the field can sometimes outnumber the researchers who 
collected the data on a big qual project, the co-construction of 
meaning with the participant can also decrease in the analysis 
process when the etic perspective is emphasized. 

In addition, depending on the roles that researchers have 
played on a big qual project in terms of the amount of data 
they have collected and analyzed and their number of years 
working on a longitudinal project, some researchers on the 
team may have an advantaged position. Specifically, some of 
the researchers will have a better “bird’s eye view” of the 
totality of the data, including the themes that cut across years 
of the project, institutions, and participant groups and how 
those themes have evolved over time. 


Interpreting data through consensus. Despite these constraints, we 
suggest that group coding may also result in a more collabora- 
tive, inclusive, and creative process than individual coding. 
Weston et al. (2001), for instance, suggest that “a research team 
builds codes and coding builds a team through the creation of 
shared interpretation and understanding of the phenomenon 
being studied” (p. 382). We have found that arriving at a “best 
interpretation” of data with a team can initially be a time- 
consuming process. However, in our experience, conducting data 
analysis with multiple researchers eventually results in a con- 
sensual interpretation that is more comprehensive and nuanced 
than the interpretation of a single researcher. 


Implications of the Debate 


Recent technological advances in CAQDAS (Bazeley & Jack- 
son, 2013; J. Davidson et al., 2016) and the increasing willing- 
ness of government and private foundations to fund large 
qualitative projects (Cheek, 2008; Plano Clark, 2010) make 
this interdisciplinary discussion increasingly essential across 
academic disciplines. Already, we have seen new research 
methods for big qual diffuse to diverse fields such as health 
sciences and medicine, business, education, environmental sci- 
ence, sociology, social work, anthropology, agriculture, and 
information science (e.g., Armstrong, Riemenschneider, 
Nelms, & Reid, 2012; Guest & MacQueen, 2008). 

We present this debate about large qualitative data sets as a 
meta-cognitive process intended to spark a broader discussion 
about whether big qual methods could someday become a qua- 
litative tradition grounded in philosophical underpinnings. In 
sparking this discussion, we argue that the benefits of big qual 
include a more inclusive, collaborative analytic process and 
increased transferability, breadth, depth, and theory-building 
potential. The challenges of big qual include a de-emphasis 
on abstract concepts in the coding framework, a possible 
decrease in the contextual depth in the data set, and potentially 
an emphasis on the etic researcher perspective. 


These benefits and challenges require us to educate students 
about the traditional aims of qualitative inquiry and to initiate 
discussions with other researchers about how these aims may 
be evolving with the introduction of new methods and techno- 
logical advances in data analysis software. We contend that in 
order to do this, we must strive to make our methods and their 
underlying assumptions as explicit as possible. Moreover, it 
requires us to think deeply about the “whys” and “hows” as 
we continue to engage in discussions around big qualitative 
inquiry. 


Future Research Directions 


Osborne-Lampkin, Cohen-Vogel, Feng, and Wilson (2018) 
comment on the use of theory in guiding scientific inquiry: 


Theory provides the ideas researchers use about how phenomenon 
operate in the world. Empirical studies then tests those theories, 
using findings from them to modify or refine the theory. . . . Frame- 
works help researchers set forth predictions about their study out- 
comes, shape the study design, and once data are collected, are 
used as a “mirror to check whether the findings agree with the 
framework or whether there are discrepancies.” (p. 189) 


An understanding of the theory undergirding qualitative 
methodologies holds promise not only for conceptualizing 
questions of inquiry but also for developing research designs 
that will enable us to connect the research to policy and prac- 
tice. In fact, we assert that it is the understanding of theory 
behind qualitative methodologies that will provide opportunities 
for creativity in further exploring big qual processes: in under- 
standing how to do this work, how to “use the theories in new 
ways and in different research contexts” (Osborne-Lampkin, 
Cohen-Vogel, Feng, & Wilson, 2018, p. 22), and how to better 
situate our research to inform the field. 

Despite the recent focus on understanding the ways in which 
practitioners and policy makers define, acquire, interpret, and 
ultimately use research in education, an increased understand- 
ing of how practitioners and policy makers make sense of and 
use research continues to be an area ripe for scientific study 
(Tseng, 2012). We contend that collaboration between practi- 
tioners and researchers in developing frameworks and pro- 
cesses for research designs that use large qualitative data sets 
can enhance or support the development of innovative metho- 
dological designs and approaches as well as support efforts to 
ensure that the “supply-side” attempts of researchers to address 
the “demand-side” needs of the “end user” are met (Tseng, 
2012). Moreover, illuminating frameworks and designs and 
developing tool kits that outline methodological procedures 
can also increase the rigor of the research being conducted and 
its use by practitioners, policy makers, and researchers alike. 

Our questions about the considerations and opportunities for 
big qual data emerged through our work as researchers engaged 
in large-scale qualitative and mixed methods studies. Yet less is 
known about the enabling structures and supports that facilitate 
this work. Organizational structures such as the methodological 
teams, frameworks, and procedures for carrying it out deserve 
additional examination. 

In moving forward, we must ask ourselves: Are there oppor- 
tunities to not only adjust theory and methodological 
approaches to better fit big qual designs but also build upon 
theoretical knowledge? Additional research around the inter- 
sections and divergences of big qual in purely qualitative stud- 
ies and varying types of mixed methods (e.g., qual-quant, 
quant-qual) studies is still needed. Also, how do we apply that 
knowledge to develop research questions, design studies, and 
analytical approaches to enhance our ability to conceptualize 
and carry out research that better informs policy and practice, 
either for multiple purposes or for narrowly tailored research 
questions? Future studies will also need to determine the aca- 
demic disciplines most likely to employ big qual methods and 
to explore the emerging research conventions of each by study- 
ing the organizational structures, theoretical frameworks, and 
most common procedures used by researchers. 


Conclusion 


Engaging in large-scale qualitative or mixed methods data col- 
lection and analysis is nothing new. We propose, however, that many 
newly introduced and still evolving conventions for large qua- 
litative data sets may be changing the fundamental nature of 
some qualitative research. Thus, we advocate for a frank dis- 
cussion of research methods to preserve the traditional nature 
of constructivist qualitative inquiry while remaining open to 
opportunities for innovation. We acknowledge that our ques- 
tions may have varying saliency depending on where research- 
ers situate themselves on the continuum of perspectives on 
qualitative research and hope that these questions spark further 
debate regarding the nature of qualitative inquiry. 
Note 


1. Our definition represents an effort towards formulating an initial 
conception of Big Qual. We acknowledge that just as the definition 
of mixed methods has evolved over time, so too will the field’s 
definition of Big Qual. Therefore, we welcome suggestions from 
readers. 